A Kolmogorov-Smirnov Correlation-Based Filter for Microarray Data

Authors

Jacek Biesiada

Wlodzislaw Duch

Abstract

A filter algorithm using the F-measure has been combined with feature redundancy removal based on the Kolmogorov-Smirnov (KS) test for rough equality of statistical distributions. As a result, a computationally efficient K-S Correlation-Based Selection algorithm has been developed and tested on three high-dimensional microarray datasets using four types of classifiers. Results are quite encouraging and several improvements are suggested.
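The redundancy-removal idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes features arrive already ranked by some relevance index (the F-measure ranking step is not reproduced here), that their values are on a comparable scale, and that "rough equality" means the two-sample KS test fails to reject at significance level `alpha`. The function names `ks_statistic`, `ks_critical_value`, and `remove_redundant` are hypothetical.

```python
import math
from bisect import bisect_right


def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    n, m = len(a_sorted), len(b_sorted)
    d = 0.0
    # The gap can only change at observed values, so checking those suffices.
    for x in set(a_sorted) | set(b_sorted):
        cdf_a = bisect_right(a_sorted, x) / n
        cdf_b = bisect_right(b_sorted, x) / m
        d = max(d, abs(cdf_a - cdf_b))
    return d


def ks_critical_value(n, m, alpha):
    """Asymptotic critical value of the two-sample KS test at level alpha."""
    c_alpha = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c_alpha * math.sqrt((n + m) / (n * m))


def remove_redundant(ranked_features, alpha=0.05):
    """Greedy filter (assumed scheme): walk features from most to least
    relevant and keep one only if the KS test separates it from every
    feature kept so far.

    ranked_features: list of (name, values) pairs, most relevant first.
    """
    kept = []
    for name, values in ranked_features:
        redundant = any(
            ks_statistic(values, kept_values)
            <= ks_critical_value(len(values), len(kept_values), alpha)
            for _, kept_values in kept
        )
        if not redundant:
            kept.append((name, values))
    return [name for name, _ in kept]


# Toy data (made up for illustration): g2 is a slightly shifted copy of g1,
# g3 lies in a completely different range.
base = list(range(10))
ranked = [
    ("g1", base),
    ("g2", [x + 0.5 for x in base]),
    ("g3", [x + 100 for x in base]),
]
selected = remove_redundant(ranked, alpha=0.05)
```

In this sketch `alpha` plays the role of the single tunable parameter: lowering it raises the critical value, so more features are treated as redundant and discarded.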

Similar resources

Feature Selection for High-Dimensional Data: A Kolmogorov-Smirnov Correlation-Based Filter

An algorithm for filtering information based on the Kolmogorov-Smirnov correlation-based approach has been implemented and tested on feature selection. The only parameter of this algorithm is the statistical confidence level that two distributions are identical. Empirical comparisons with 4 other state-of-the-art feature selection algorithms (FCBF, CorrSF, ReliefF and ConnSF) are very encouraging.

Important Features PCA for high dimensional clustering

We consider a clustering problem where we observe feature vectors X_i ∈ R^p, i = 1, 2, ..., n, from K possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the modern regime of p ≫ n, where classical clustering methods face challenges. We propose Important Features PCA (IF-PCA) as a new clustering procedure. In IF-PCA, we select a ...

Influential Features Pca for High Dimensional Clustering

We consider a clustering problem where we observe feature vectors X_i ∈ R^p, i = 1, 2, ..., n, from K possible classes. The class labels are unknown and the main interest is to estimate them. We are primarily interested in the modern regime of p ≫ n, where classical clustering methods face challenges. We propose Influential Features PCA (IF-PCA) as a new clustering procedure. In IF-PCA, we select...

A permutation test motivated by microarray data analysis

We introduce a nonparametric test intended for large-scale simultaneous inference in situations where the utility of distribution-free tests is limited because of their discrete nature. Such situations are frequently dealt with in microarray analysis where the number of tests is much larger than the sample size. The proposed test statistic is based on a certain distance between the distribution...

SFLA Based Gene Selection Approach for Improving Cancer Classification Accuracy

In this paper, we propose a new gene selection algorithm based on the Shuffled Frog Leaping Algorithm, called SFLA-FS. The proposed algorithm is used for improving cancer classification accuracy. Most biological datasets, such as cancer datasets, have a large number of genes and few samples. However, most of these genes are not usable in some tasks, for example in cancer classification....

Publication date: 2007
